The final work is the Rmd file named “data-analysis-replication.Rmd.” The html file is just committed for reference.

Before we get started:

Here are some “housekeeping” chunks.

In this way, the packages needed should be installed before the require() function. I deleted some of the packages that I thought I would use but I didn’t. Just in case: the earlier full list is c(“tidyverse”, “dplyr”, “mosaic”, “ggplot2”, “infer”, “skimr”, “cowplot”, “broom”, “car”, “jtools”, “naniar”, “MuMIn”, “lme4”, “lmerTest”, “lmtest”, “broom.mixed”)

pkg <- c("tidyverse", "dplyr", "ggplot2", "lme4", "broom.mixed")
not_installed <- pkg[!pkg %in% rownames(installed.packages())]
if (length(not_installed) > 0) install.packages(not_installed) 

lapply(pkg, require, character.only = TRUE)
## Loading required package: tidyverse
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
## Loading required package: lme4
## 
## Loading required package: Matrix
## 
## 
## Attaching package: 'Matrix'
## 
## 
## The following objects are masked from 'package:tidyr':
## 
##     expand, pack, unpack
## 
## 
## Loading required package: broom.mixed
## [[1]]
## [1] TRUE
## 
## [[2]]
## [1] TRUE
## 
## [[3]]
## [1] TRUE
## 
## [[4]]
## [1] TRUE
## 
## [[5]]
## [1] TRUE

1 Introduction

This is a journal article written by three researchers from University of Amsterdam, the Netherlands (Broedelet et al., 2023). It is named “Broedelet et al., 2023” in the repository/folder. Hereby, all “section” and “subsection” refers to the portions in the original article, whereas “Figure” and “Table” refer to the ones with titles in this Rmd/html file.

The study asks two major research questions.

  • The first research question question (RQ1): “Are children with developmental language disorder (DLD) less sensitive to distributional cues compared to TD children when learning novel visual object categories in an experiment?”
  • The second question (RQ2): “Does the ability of visual distributional learning contribute to lexical knowledge in children with DLD?”

Here are some terms to clarify (from both the article and me):

  • DLD: DLD is a communication disorder that interferes with learning, understanding, and using language. More information can be found here https://www.nidcd.nih.gov/health/developmental-language-disorder. The descriptive picture for the causes and symptoms for this disorder is not yet very clear, so a wide variety of researches focuses on the population with DLD and study their abilities across all linguistic domains and other related areas, in most of the cases, in comparison to typically developing (TD) children. In this paper, their object was to study the role (“nature and extent”) of statistical learning in lexical-semantic development. In the field of Linguistics, how humans acquire their first languages is also one of the biggest puzzles. Studies on pathology and populations with language impairment, in turn, throw light on the question for those who are interested in this topic.

  • Lexical-semantics: Semantic is a linguistic domain of language meanings, and lexical-semantics is a sub-field of word meanings.

  • Distributional learning (distributional cues/visual distributional learning): Some think to recognize is to categorize (e.g., Harnad et al., 2005). “Distributional learning is a form of statistical learning and entails the learning of categories based on the frequency distribution of variants in the environment.” The article specifies “a visual distributional learning task based on Junge et al. (2018) to test novel object categorization in children with and without DLD.”

Two txt format data files were published by the researchers. I saved two files as csv files. Each file targets at one research question.

df_Q1 <- read_csv("./data/CAT_Test.csv", col_names = TRUE)
## New names:
## Rows: 400 Columns: 20
## ── Column specification
## ──────────────────────────────────────────────────────── Delimiter: "," chr
## (10): Condition, StimRef, StimLeft, StimRight, Target, AnswerStim, Posit... dbl
## (10): ...1, Subject, Trial, Item, ACC, RT, AnswerStimD1, AnswerStimD2, M...
## ℹ Use `spec()` to retrieve the full column specification for this data. ℹ
## Specify the column types or set `show_col_types = FALSE` to quiet this message.
## • `` -> `...1`
df_Q2 <- read_csv("./data/CAT_regr.csv", col_names = TRUE)
## Rows: 25 Columns: 14
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (14): Subject, DLMeanAcc, Age_months, ActVocab_raw, DigitSpanFW_raw, Dig...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Now we have some ideas on most of the variables involved to answer RQ1 and RQ2. The ones involved my replication are:

RQ1:

  • Group: Researchers got a final sample of 25 children with DLD. 25 TD children were matched from a larger sample from previous studies where data were already collected. The children with DLD “were recruited via different institutions in the Netherlands;” the TD children were recruited from two primary schools. Their recruitment criteria are quite standard to my knowledge. For more details, please see subsection Participants under section Methodfrom the article. This is a categorical variable

  • Condition: Figure 1 demonstrate the experimental design. Participants will get familiarized with a continuum of tokens. The frequency distribution of each token are plotted in Figure 1. Each participant will be assigned to either Condition 1 (Blue) or Condition 2 (Orange). Token 4, Token 6 and Token 8 were named D1 (Deviant 1), S (Standard) and D2 (Deviant 2). For a participant who is sensitive to frequency distributional cues, they will choose that D1 and S are more alike if assigned Condition 1, and S and D2 are more alike if assigned Condition 2, due to the different arrangement of the category boundary. This is also a categorical variable.

  • Age_months: This variable is inherently treated as a predictor variable in this type of study. In this study, it is treated as a continuous variable.

  • AnswerStimD2: This is the dependent variable, i.e., choice made by participants in the tasks after familiarization with either condition. This variable, relative to Condition, depicts the distributional learning ability. This variable is in the format of binomial counts. The nature of this variable determines that for RQ1, I will likely choose glmer(family = "binomial") for the inferential analysis replication.

  • PositionD2: They suspect when participants made the choice, whether D2 token was presented on the left or right (see Figure 2) on the screen might affect their choice preference, so they added this categorical variable as the random factor.

  • Subject: This vairable is inherently treated as a random factor in this type of study.

The head() function allows us to get an overview of the dataset structure and examine all the variables as the column names

head(df_Q1)
## # A tibble: 6 × 20
##    ...1 Subject Condition   Trial  Item   ACC StimRef StimLeft StimRight Target
##   <dbl>   <dbl> <chr>       <dbl> <dbl> <dbl> <chr>   <chr>    <chr>     <chr> 
## 1     1     501 Condition 1     2     1     0 S       D2       D1        D2    
## 2     2     501 Condition 1     3     7     0 S       D2       D1        D2    
## 3     3     501 Condition 1     4     5     0 S       D2       D1        D2    
## 4     4     501 Condition 1     6     6     0 S       D1       D2        D2    
## 5     5     501 Condition 1     7     2     0 S       D1       D2        D2    
## 6     6     501 Condition 1     9     8     0 S       D1       D2        D2    
## # ℹ 10 more variables: RT <dbl>, AnswerStim <chr>, PositionD2 <chr>,
## #   AnswerStimD1 <dbl>, AnswerStimD2 <dbl>, MeanAccPP <dbl>, Gender <chr>,
## #   Group <chr>, Age_months <dbl>, Age_ym <chr>
Figure 1 Familiarization conditions in the experiment.
Figure 1 Familiarization conditions in the experiment.
Figure 2 A test question and filler/practice question.
Figure 2 A test question and filler/practice question.

RQ2:

  • DLMeanAccuracy: Different from RQ1, they use accuracy (range from 0-1) based on participants’ choices to represent the distributional learning ability (e.g., choosing D1 in a test question if assigned Condition 1 is considered accurate). It’s a transformation of earlier data (Condition, AnswerStimD1 and AnswerStimD2).

  • Children with DLD’s scores on various tasks: The study did many multiple regression analyses on multiple areas of abilities with different components (scores).

head(df_Q2)
## # A tibble: 6 × 14
##   Subject DLMeanAcc Age_months ActVocab_raw DigitSpanFW_raw DigitSpanBW_raw
##     <dbl>     <dbl>      <dbl>        <dbl>           <dbl>           <dbl>
## 1     701     0             88           21               5               3
## 2     702     1             98            8               5               4
## 3     703     0.125        104           38               8               4
## 4     705     1             98           33               7               2
## 5     706     0.375         93           21               5               2
## 6     707     0.75         111           36               5               4
## # ℹ 8 more variables: DigitSpanTot_raw <dbl>, NWR_raw <dbl>,
## #   PassVocab_raw <dbl>, NonVerbInt_raw <dbl>, WordAss_raw <dbl>,
## #   WordCatEx_norm <dbl>, WordCatRec_norm <dbl>, WordCatTot_norm <dbl>

For RQ1, they did three major analyses other than descriptive data presentation and visulizations. They are: “Split-half reliability distributional learning task,”Group comparison distributional learning task” and “Exploratory results.” The results are listed in several sub-sections under section Results. I only replicate the descriptive data visualization for RQ1 (participants’ choices under different conditions) and the second analysis, which is the most essential analysis to my judgment. In the first analysis, they concluded that the task is a reliable test. In the second analysis, they found that “school-aged children can learn novel visual object categories based on distributional properties,” but they couldn’t conclude “whether children with DLD do or do not have a distributional learning deficit.” I will replicate this analysis in addition to the visualization. In the third analysis, they only include children with DLD but couldn’t conclude whether children with DLD are able to learn novel visual object categories based on distributional information.

For RQ2, their dataset file only include children with DLD and their performance across multiple areas. They present the results of “four separate multiple linear regression analyses in R to test the relationship between distributional learning and different measures of vocabulary.” as well as some descriptive data/visualizations in subsection Regression analyses in section Results. I only replicate the descriptive data analyses and visualizations.

2 Visualization of Data

2.1 RQ1

Here is the visualization for RQ1.

Figure 3 Choice for stimulus D1 or D2 depending on condition and group.
Figure 3 Choice for stimulus D1 or D2 depending on condition and group.

I recoded the column AnswerStimD2 to enable y = count in ggplot.

df_Q1_visual <- df_Q1 %>% 
  mutate(AnswerStimD2 = case_when(AnswerStimD2 == 0 ~ "D1",
                       AnswerStimD2 == 1 ~ "D2")) %>%
  group_by(Group, Condition, AnswerStimD2) %>% 
  summarise(count = n())
## `summarise()` has grouped output by 'Group', 'Condition'. You can override
## using the `.groups` argument.

Here is my replication for Figure 3.

ggplot(df_Q1_visual, aes(x = factor(Condition), y = count, fill = factor(AnswerStimD2))) +
  geom_bar(stat="identity") +
  labs(x = "Condition", y = "Count", fill = "Answer") +
  theme_minimal() +
  facet_wrap(~Group)

Figure 3 is replicable and we can see that choice counts for D1 and D2 depending on condition and group are same in two figures. From the visualization, at least I have a sense that school-age children do make some different choices (i.e. being sensitive to ) based on conditions.

2.2 RQ2

  • DLMeanAccuracy Here is the visualization for DLMeanAccuracy.
Figure 4 Distribution of accuracy scores on the distributional learning task.
Figure 4 Distribution of accuracy scores on the distributional learning task.

Here is my replication for Figure 4.

ggplot(df_Q2, aes(x=DLMeanAcc)) +
  geom_histogram(position="dodge")+
  theme(legend.position="top")+
  labs(x = "Accuracy", y = "Count")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Again, we see Figure 4 is replicable. One take-away message from Figure 4 is that the DLD group may not perform as bad as we might expected.

  • Children with DLD’s scores on various tasks Here are two tables with other descriptive data (scores of multiple tests)
Table 1 Distribution of accuracy scores on the distributional learning task.
Table 1 Distribution of accuracy scores on the distributional learning task.
Table 2 Distribution of accuracy scores on the distributional learning task.
Table 2 Distribution of accuracy scores on the distributional learning task.

Here is my replication:

For a better visualization, I rename the column names to make the output more readable.

df_Q2_desc <- df_Q2
colnames <- c("Subject", "Accuracy", "Age months", "Productive vocabulary", "Digit span forwards", "Digit Span Backwards", "Digit span total", "Non-word repetition", "Receptive vocabulary", "Raven’s progressive matrices", "Word associations", "Word classes receptive", "Word classes expressive", "Word classes total")
colnames(df_Q2_desc) <- colnames

Then I use pivot_longer() to enable summarise(). Column names go under “Task” and test scores go under “Score.”

df_Q2_desc <- df_Q2_desc %>%
  pivot_longer(cols = c("Productive vocabulary", "Digit span forwards", "Digit Span Backwards", "Digit span total", "Non-word repetition", "Receptive vocabulary", "Raven’s progressive matrices", "Word associations", "Word classes receptive", "Word classes expressive", "Word classes total"), names_to = "Task", values_to="Score")

The columns of the tables I summarised are mean, standard deviation and the ranges following the column names of Table 1 and Table 2. Some elements in Table 1 and Table 2 are missing. For example, age-equivalent scores are corresponding scores on the test manuals (or other form of the norms) to look up, and those data were not included in the shared dataset. For almost all test scores, raw scores were used in further regression analyses, so those data are available in the dataset, but for there are three exceptions as explained in the original paper, For those three scores, norm scores instead of raw scores are available. I added one column in the replication to specify what type of score is available. Data from two tables (Table 1 and Table 2) are included. I added one column to specify the test category (vocabulary or control).

df_Q2_desc_sum <- df_Q2_desc %>%
  group_by(Task) %>%
  summarise(M = mean(Score), SD = sd(Score), Range = paste0(min(Score), "..", max(Score))) %>%
  mutate(Score_type = case_when(Task %in% c("Productive vocabulary", "Digit span forwards", "Digit Span Backwards", "Digit span total", "Non-word repetition", "Receptive vocabulary", "Raven’s progressive matrices", "Word associations") ~ "Raw",
                       Task %in% c("Word classes receptive", "Word classes expressive", "Word classes total") ~ "Norm"))%>%
  mutate(Task_type = case_when(Task %in% c("Productive vocabulary", "Receptive vocabulary", "Word associations", "Word classes receptive", "Word classes expressive", "Word classes total") ~ "vocabulary",
                       Task %in% c("Raven’s progressive matrices", "Digit span forwards", "Digit Span Backwards", "Digit span total", "Non-word repetition") ~ "control")) %>%
  arrange(Task_type, Score_type)

df_Q2_desc_sum
## # A tibble: 11 × 6
##    Task                             M    SD Range   Score_type Task_type 
##    <chr>                        <dbl> <dbl> <chr>   <chr>      <chr>     
##  1 Digit Span Backwards          2.72  1.02 0..4    Raw        control   
##  2 Digit span forwards           5.36  1.58 3..9    Raw        control   
##  3 Digit span total              8.08  1.91 4..12   Raw        control   
##  4 Non-word repetition           3.36  2.36 0..9    Raw        control   
##  5 Raven’s progressive matrices 23.2   7.41 11..38  Raw        control   
##  6 Word classes expressive       7.24  2.63 3..12   Norm       vocabulary
##  7 Word classes receptive        6.88  2.71 1..13   Norm       vocabulary
##  8 Word classes total            6.88  2.60 2..13   Norm       vocabulary
##  9 Productive vocabulary        28.2   8.94 8..41   Raw        vocabulary
## 10 Receptive vocabulary         90.5  13.0  70..119 Raw        vocabulary
## 11 Word associations            23.9   6.37 10..42  Raw        vocabulary

Among the available data, the replication is accurate compared to Table 1 and Table 2.

3 Statistical Replications/Reanalysis

I replicated the second analysis “Group comparison distributional learning task.”

Here is a thorough description from the article in Figure 5

Figure 5 Description on group comparison analysis.
Figure 5 Description on group comparison analysis.

As earlier stated, the core function should be glmer() due to the nature of the dependent variable. We don’t have to do the first step as in the available dataset, as fillers and practices were already removed. I first processed several involved categories with as.factor() so the numeric information in Subject wouldn’t interfere with the formula, and variables are ready for further sum-to-zero orthogonal coding.

df_Q1_inf <- df_Q1 %>%
  mutate(Condition = as.factor(Condition),
         Group = as.factor(Group),
         PositionD2 = as.factor(PositionD2),
         Subject = as.factor(Subject))
str(df_Q1_inf)
## tibble [400 × 20] (S3: tbl_df/tbl/data.frame)
##  $ ...1        : num [1:400] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Subject     : Factor w/ 50 levels "501","502","503",..: 1 1 1 1 1 1 1 1 2 2 ...
##  $ Condition   : Factor w/ 2 levels "Condition 1",..: 1 1 1 1 1 1 1 1 2 2 ...
##  $ Trial       : num [1:400] 2 3 4 6 7 9 10 12 2 3 ...
##  $ Item        : num [1:400] 1 7 5 6 2 8 3 4 1 2 ...
##  $ ACC         : num [1:400] 0 0 0 0 0 0 0 0 0 1 ...
##  $ StimRef     : chr [1:400] "S" "S" "S" "S" ...
##  $ StimLeft    : chr [1:400] "D2" "D2" "D2" "D1" ...
##  $ StimRight   : chr [1:400] "D1" "D1" "D1" "D2" ...
##  $ Target      : chr [1:400] "D2" "D2" "D2" "D2" ...
##  $ RT          : num [1:400] 2287 3803 2696 3585 2224 ...
##  $ AnswerStim  : chr [1:400] "D1" "D1" "D1" "D1" ...
##  $ PositionD2  : Factor w/ 2 levels "Left","Right": 1 1 1 2 2 2 1 2 1 2 ...
##  $ AnswerStimD1: num [1:400] 1 1 1 1 1 1 1 1 0 1 ...
##  $ AnswerStimD2: num [1:400] 0 0 0 0 0 0 0 0 1 0 ...
##  $ MeanAccPP   : num [1:400] 0 0 0 0 0 0 0 0 0.75 0.75 ...
##  $ Gender      : chr [1:400] "F" "F" "F" "F" ...
##  $ Group       : Factor w/ 2 levels "DLD","TD": 2 2 2 2 2 2 2 2 2 2 ...
##  $ Age_months  : num [1:400] 100 100 100 100 100 100 100 100 95 95 ...
##  $ Age_ym      : chr [1:400] "8;4" "8;4" "8;4" "8;4" ...

It would be handy to print the original levels of these three variables before we set out to do the sum-to-zero orthogonal coding.

print(c(levels(df_Q1_inf$Condition), levels(df_Q1_inf$Group), levels(df_Q1_inf$PositionD2)))
## [1] "Condition 1" "Condition 2" "DLD"         "TD"          "Left"       
## [6] "Right"

The way I do sum coding follows the suggestions in Figure 6. The suggestions were from ChatGPT and the conversation link is provided by one of my mentors for another project.

Figure 6 Coding scheme, contrasts and interpretation of the coefficients. According to the suggestion, this is what I will do.

contrasts(df_Q1_inf$Condition) <- contr.sum(2)
contrasts(df_Q1_inf$Group) <- contr.sum(2)
contrasts(df_Q1_inf$PositionD2) <- contr.sum(2)

But Figure 5 specified further details on the contrasts. In the following chunks I followed the instructions to replicate the details. These are the products of the previous chunk for Condition

contrasts(df_Q1_inf$Condition)
##             [,1]
## Condition 1    1
## Condition 2   -1

Here are the adjusted ones.

contrasts(df_Q1_inf$Condition)[,1] = c(0.5, -0.5)
contrasts(df_Q1_inf$Condition)
##             [,1]
## Condition 1  0.5
## Condition 2 -0.5

Same for other variables.

contrasts(df_Q1_inf$Group)
##     [,1]
## DLD    1
## TD    -1
contrasts(df_Q1_inf$Group)[,1] = c(-0.5, 0.5)
contrasts(df_Q1_inf$Group)
##     [,1]
## DLD -0.5
## TD   0.5
contrasts(df_Q1_inf$PositionD2)
##       [,1]
## Left     1
## Right   -1
contrasts(df_Q1_inf$PositionD2)[,1] = c(0.5, -0.5)
contrasts(df_Q1_inf$PositionD2)
##       [,1]
## Left   0.5
## Right -0.5

The processing for Age_months is a bit different, but the idea is the same.

head(df_Q1_inf$Age_months <- scale(df_Q1_inf$Age_months, scale=FALSE))
##      [,1]
## [1,]  2.9
## [2,]  2.9
## [3,]  2.9
## [4,]  2.9
## [5,]  2.9
## [6,]  2.9

Then I write out the model (especially the formula) based on the description in Figure 6.

m_inf_0 <- glmer(AnswerStimD2~Condition*Group*Age_months + Condition*PositionD2 + (1|Subject) + (PositionD2|Subject) , data=df_Q1_inf, family = "binomial")
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## unable to evaluate scaled gradient
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge: degenerate Hessian with 1 negative eigenvalues
summary(m_inf_0)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: 
## AnswerStimD2 ~ Condition * Group * Age_months + Condition * PositionD2 +  
##     (1 | Subject) + (PositionD2 | Subject)
##    Data: df_Q1_inf
## 
##      AIC      BIC   logLik deviance df.resid 
##    447.8    503.7   -209.9    419.8      386 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.9109 -0.5202 -0.2974  0.5533  3.0067 
## 
## Random effects:
##  Groups    Name        Variance Std.Dev. Corr 
##  Subject   (Intercept) 0.5610   0.7490        
##  Subject.1 (Intercept) 1.2619   1.1233        
##            PositionD21 0.4114   0.6414   -1.00
## Number of obs: 400, groups:  Subject, 50
## 
## Fixed effects:
##                               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                  -1.204216   0.263571  -4.569  4.9e-06 ***
## Condition1                    1.397168   0.507302   2.754  0.00589 ** 
## Group1                        0.585513   0.487481   1.201  0.22971    
## Age_months                   -0.032793   0.042056  -0.780  0.43554    
## PositionD21                   0.611823   0.360980   1.695  0.09010 .  
## Condition1:Group1            -0.002138   0.969560  -0.002  0.99824    
## Condition1:Age_months        -0.062258   0.094602  -0.658  0.51047    
## Group1:Age_months             0.009085   0.088528   0.103  0.91826    
## Condition1:PositionD21       -0.136239   0.629104  -0.217  0.82855    
## Condition1:Group1:Age_months -0.342899   0.212337  -1.615  0.10634    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) Cndtn1 Group1 Ag_mnt PstD21 Cn1:G1 Cn1:A_ Gr1:A_ C1:PD2
## Condition1  -0.183                                                        
## Group1      -0.050 -0.004                                                 
## Age_months  -0.035  0.039 -0.124                                          
## PositionD21 -0.322  0.078  0.140  0.010                                   
## Cndtn1:Grp1 -0.016 -0.017 -0.171  0.196 -0.117                            
## Cndtn1:Ag_m  0.002 -0.015  0.286  0.088  0.255 -0.218                     
## Grp1:Ag_mnt -0.140  0.201  0.035  0.223  0.166 -0.032  0.094              
## Cndtn1:PD21  0.078 -0.296 -0.087 -0.019 -0.350  0.085 -0.188 -0.119       
## Cndt1:G1:A_  0.125 -0.083  0.166 -0.028  0.344 -0.168  0.437  0.250 -0.258
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## unable to evaluate scaled gradient
## Model failed to converge: degenerate  Hessian with 1 negative eigenvalues

Starting from the scratch, I’ve seen two types of warning:

  • Warning: unable to evaluate scaled gradientWarning: Model failed to converge: degenerate Hessian with 1 negative eigenvalues
  • Warning in checkConv(attr(opt, “derivs”), opt\(par, ctrl = control\)checkConv, : unable to evaluate scaled gradient

The Estimates from this model doesn’t seem to be correct either. So I searched for some info and added an optimizer based on the suggestions from https://stats.stackexchange.com/questions/164457/r-glmer-warnings-model-fails-to-converge-model-is-nearly-unidentifiable

m_inf <- glmer(AnswerStimD2~Condition*Group*Age_months + Condition*PositionD2 + (1|Subject) + (PositionD2|Subject) , data=df_Q1_inf, family = "binomial", glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=100000)))
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: large eigenvalue ratio
##  - Rescale variables?
summary(m_inf)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: 
## AnswerStimD2 ~ Condition * Group * Age_months + Condition * PositionD2 +  
##     (1 | Subject) + (PositionD2 | Subject)
##    Data: df_Q1_inf
## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e+05))
## 
##      AIC      BIC   logLik deviance df.resid 
##    447.8    503.7   -209.9    419.8      386 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.9002 -0.5204 -0.2933  0.5531  3.0038 
## 
## Random effects:
##  Groups    Name        Variance Std.Dev. Corr 
##  Subject   (Intercept) 0.6477   0.8048        
##  Subject.1 (Intercept) 1.1712   1.0822        
##            PositionD21 0.4132   0.6428   -0.98
## Number of obs: 400, groups:  Subject, 50
## 
## Fixed effects:
##                               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                  -1.205521   0.262463  -4.593 4.37e-06 ***
## Condition1                    1.396511   0.506362   2.758  0.00582 ** 
## Group1                        0.576846   0.489381   1.179  0.23851    
## Age_months                   -0.033863   0.042227  -0.802  0.42259    
## PositionD21                   0.602816   0.360218   1.673  0.09423 .  
## Condition1:Group1             0.006756   0.971582   0.007  0.99445    
## Condition1:Age_months        -0.065210   0.094284  -0.692  0.48917    
## Group1:Age_months             0.007933   0.088645   0.089  0.92870    
## Condition1:PositionD21       -0.122365   0.627926  -0.195  0.84549    
## Condition1:Group1:Age_months -0.349616   0.209860  -1.666  0.09572 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) Cndtn1 Group1 Ag_mnt PstD21 Cn1:G1 Cn1:A_ Gr1:A_ C1:PD2
## Condition1  -0.180                                                        
## Group1      -0.051 -0.005                                                 
## Age_months  -0.035  0.039 -0.124                                          
## PositionD21 -0.313  0.075  0.141  0.010                                   
## Cndtn1:Grp1 -0.014 -0.017 -0.169  0.197 -0.114                            
## Cndtn1:Ag_m -0.003 -0.014  0.284  0.088  0.250 -0.212                     
## Grp1:Ag_mnt -0.145  0.204  0.033  0.223  0.164 -0.027  0.084              
## Cndtn1:PD21  0.078 -0.284 -0.087 -0.019 -0.347  0.083 -0.183 -0.117       
## Cndt1:G1:A_  0.120 -0.082  0.165 -0.031  0.338 -0.160  0.426  0.241 -0.252
## optimizer (bobyqa) convergence code: 0 (OK)
## Model is nearly unidentifiable: large eigenvalue ratio
##  - Rescale variables?

There is still a warning, but the later replication analyses yield same results as the original article results, so I decided to just keep it here.

The first prediction that they made for RQ1 is that school-aged children should demonstrate distributional learning ability, which should “manifest as a significant effect of Condition on the dependent variable.” See Figure 7.

Figure 7 Predictions and manifestations.
Figure 7 Predictions and manifestations.

Here are the confirmatory results. See Figure 8.

Figure 8 confirmatory results. Their interpretation focuses on the odds ratio change. To calculate the ORchange for Condition1, I tried to extract the estimate for Condition1. See https://difiore.github.io/ada-2024/23-module.html. Instead of $coefficients, I used fixef() as suggested in https://stackoverflow.com/questions/26417005/odds-ratio-and-confidence-intervals-from-glmer-output. In the same link, I got a line of code to generate the confidence intervals.

(ORchange_1 <- exp(fixef(m_inf)[2]))
## Condition1 
##   4.041078
CI_1 <- tidy(m_inf,conf.int=TRUE,exponentiate=TRUE,effects="fixed")
CI_1 %>%
  filter(term == "Condition1") %>%
  select(conf.low, conf.high)
## # A tibble: 1 × 2
##   conf.low conf.high
##      <dbl>     <dbl>
## 1     1.50      10.9

Thus, we replicated the results that “children in Condition 1 were 4.04 times more likely to choose stimulus D2 than children in Condition 2, and this effect was significantly above 1: z = 2.758, p = 0.006.” This is in line with their prediction.

For the second condition, I repeated the similar process.

(ORchange_2 <- exp(fixef(m_inf)[6]))
## Condition1:Group1 
##          1.006779
CI_2 <- tidy(m_inf,conf.int=TRUE,exponentiate=TRUE,effects="fixed")
CI_2%>%
  filter(term == "Condition1:Group1") %>%
  select(conf.low, conf.high)
## # A tibble: 1 × 2
##   conf.low conf.high
##      <dbl>     <dbl>
## 1    0.150      6.76

Thus we replicated the results that “although the effect of Condition was 1.007 times stronger in the TD group compared to the DLD group, this interaction between Condition and Group was not significantly above 1: z = 0.007, p = 0.994, 95% CI 0.15 .. 6.8.” Their second prediction is not confirmed.

(1/0.1499334) #conf.low
## [1] 6.669628

With this simple calculation we also replicated that “the confidence interval tells us that children with DLD could be up to 6.7 times better or 6.8 times weaker on the visual distributional learning task than TD children.” They couldn’t “conclude whether children with DLD do or do not have a distributional learning deficit.”

Here is a table of summarization for the results for two predictions.

df_Q1_inf_sum <- tidy(m_inf,conf.int=TRUE,exponentiate=TRUE,effects="fixed")
df_Q1_inf_sum %>%
  filter(term == "Condition1" | term == "Condition1:Group1") %>%
  select(term, estimate, p.value, conf.low, conf.high)
## # A tibble: 2 × 5
##   term              estimate p.value conf.low conf.high
##   <chr>                <dbl>   <dbl>    <dbl>     <dbl>
## 1 Condition1            4.04 0.00582    1.50      10.9 
## 2 Condition1:Group1     1.01 0.994      0.150      6.76

4 Summary/Discussion

According to the replication portion I did, I think the overall process is very successful. The descriptive statistical analysis and two visualizations all match the presentations in the article.

One minor problem from the descriptive statistical analysis is that for many scores, the full data for norm scores are not available (not used in further analyses and thus not included in the data file), so some elements in Table 1 and Table 2 were missing in the replication, resulting in some missing interpretations which are present in the article. Simple raw scores do not allow us to compare between children with DLD to the norms and between areas of abilities.

The biggest problem that I encounter is in the inferential statistical analysis for RQ1, where I have to reconstruct the formula and additional arguments in my code chunk as the formula was not listed and some details were still missing. They didn’t provide the details of using an optimizer (the replicated results indicate they did). I get a random piece of code resorting to stackexchange in response to the warning and mismatched results, and wasn’t confident in that. Although the nature of the assignment determines that formula being unavailable in the article is a good thing, but for replication (including design replication) purpose, it might be beneficial for the researchers to list the formula for others to take as reference and design similar experiments and analysis protocols. Overall, although the formula and other arguments to place in glmer() function wasn’t directly available, the researchers did provide very thorough descriptions on variables and effects, as well as how they apply sum-to-zero orthogonal coding. These details are extremely helpful. The paragraph in Figure 5 provides me with a good example in my future writing. Although I replicated the results presented in the original paper for the single inferential analysis I picked, I am still less confident in using the sum coding as described in the article and the interpretation (of what is set as the reference level).

The replicated results are quite expected from an academic perspective. The profiles of children with DLD could be highly variable. The nature of this diagnosis and the population might explains Figure 4 (widely distributed accuracy ranging from 0-1 in the sample) as well as other non-conclusions from the study.

5 References

5.2 Publications:

  • Broedelet, I., Boersma, P., & Rispens, J. (2023). Distributional learning of novel visual object categories in children with and without developmental language disorder. Lang. Develop. Res, 2, 1-43.
  • Harnad, S., Cohen, H., & Lefebvre, C. (2005). Handbook of categorization in cognitive science. Oxford: Elsevier, chapter Cognition is categorization. Hayes, BP (1980). A metrical theory of stress rules. Massachusetts Institute of Technology. Heald, SL, Van Hedger, SC, & Nusbaum, HC (2017). Perceptual plasticity for auditory object recognition. Frontiers in Psychology, 8(781), 131-138.
  • Junge, C., van Rooijen, R., & Raijmakers, M. (2018). Distributional information shapes infants’ categorization of objects. Infancy, 23(6), 917-926.